Abstract. Using random matrix techniques and the theory of Matrix Product States we show that
reduced density matrices of quantum spin chains have generically maximum entropy.






Jaynes' principle of maximum entropy [Jay57a, Jay57b] gives a quite satisfactory solution to the old
problem of dealing with prior information in probability theory. Generalizing the old principle of indifference
of Laplace, it briefly states that among all possible probability distributions compatible with our prior
information, the best choice is the one which maximizes the Shannon entropy. Apart from its important
applications in decision theory, since its origin it has succeeded in giving a very useful information-theoretical
view of statistical mechanics, both classical [Jay57a] and quantum [Jay57b], where the function to maximize
is the von Neumann entropy. As an easy illustration, given the average energy of a quantum system as
prior information, the density matrix which maximizes entropy is exactly the thermal state associated to
that particular energy. This spirit has been recently recovered with great success in [PSW06] and further
developed in a number of ways in [BCH+11, Cra11, LPSW09, LPSW10].



To what extent the principle of maximum entropy can be extended to more and more general situations
has been a very active and controversial field in the last half-century. For instance, very recently a series
of theoretical and experimental works [CFM+08, FCM+08, TCF+11] seem to validate the principle in
relaxation processes of quantum systems when focusing on a particular small subsystem, which, as argued
in Figure 1, is the most relevant situation.

Another, even older, principle to assign prior probabilities in physical problems relies on the symmetries
of the problem (see [Jay68] for a discussion). For instance, if one wants to incorporate in the problem
some invariance, i.e. independence of the reference frame, this already reduces the class of prior probability
distributions available. Indeed, if one has enough symmetries (they form a compact group) there is
a unique way of defining a prior distribution compatible with the symmetries, the Haar measure, and the
problem is solved.

But what if one wants to incorporate into the problem some less standard knowledge? For instance, that
the interactions in our model are local and homogeneous and that we work at zero temperature, but without
any assumption on the particular interactions in the model itself. Note the difference with Jaynes' approach,
where the particular Hamiltonian of the model is known. Is there any way of incorporating this information
into the problem? Which is then the right prior probability? Is it related to maximizing some entropy?
Since this type of assumption is natural and widely accepted, solving these questions could be of utmost
importance in quantum condensed-matter problems. In this paper we attack (and to some extent solve)
them in the particular case of 1D spin systems.

To do that we will take advantage of the recent developments in the understanding of quantum spin
chains, where it is nowadays well justified, both numerically [Whi92] and analytically [Has07], that
their ground states are exactly represented by the set of Matrix Product States (MPS) with polynomial
bond dimension. We will concentrate on the situation of a chain with boundary effects in exponentially
small regions of size $b$ at both ends, homogeneity in the bulk and experimental access to an exponentially
small central region of size $l$ (see Figure 1). Tracing out the boundary terms leads to a bulk state given by

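$$\rho = \sum_{i_{b+1},\dots,i_{N-b},\,j_{b+1},\dots,j_{N-b}=1}^{d} \operatorname{tr}\!\left(L A_{i_{b+1}} \cdots A_{i_{N-b}} R A_{j_{N-b}}^\dagger \cdots A_{j_{b+1}}^\dagger\right) |i_{b+1}\cdots i_{N-b}\rangle\langle j_{b+1}\cdots j_{N-b}| \tag{1}$$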
Figure 1. In this work we consider a chain of $N$ sites, with homogeneous interactions in
the bulk and boundary effects in exponentially small regions of size $b$ at the borders. We
assume that the experimentally accessible region (and hence the region we are interested in)
is an exponentially small region of size $l$ in the center of the chain.



where all $A_i$, $L \ge 0$ and $R \ge 0$ are $D \times D$ matrices with $D = \mathrm{poly}(N)$. This will be our starting point, that
is, the prior information can be understood as restricting the bulk states of our system to have the form (1).

Now, it is also known from the general theory of MPS [PGVWC06] that this set has a natural (over)parametrization
by the group $U(dD)$, via the map $U \mapsto A_i = \langle 0|U|i\rangle$. Since $U(dD)$ is a unitary group, one can use
the symmetry-based assignment of prior distributions and sample from the Haar measure. Similarly, the fact
that the map $X \mapsto \sum_i A_i X A_i^\dagger$ is trace-preserving leads us to consider $\operatorname{tr}(R) = 1$, $\|L\|_\infty \le 1$, giving us natural
ways of sampling the boundary conditions as well (see below). One can therefore ask what the
generic reduced density matrix $\rho_l$ of $l \ll N$ sites is. Note that, by the above comments, this is nothing but
asking about generic observations of 1D quantum systems. This idea has already been exploited for the
non-translationally invariant case in [GdOZ10]. The aim of the present work is to show that $\rho_l$ has generically
maximum entropy:

Theorem 0.1. Let $\rho_l$ be taken at random from the ensemble introduced with $D \ge N^5$. Then
$\|\rho_l/\operatorname{tr}\rho_l - \mathbb{1}/d^l\|_\infty \le \sqrt{(d^l-1)/d^l}\; O(D^{-1/10})$ except with probability exponentially small in $D$.

Note that, since the accessible region $l$ is exponentially smaller than the system size, the bound can be
made arbitrarily small while keeping the size $D$ of the matrices polynomial in the system size.

To prove the theorem, we will rely on recent developments in random matrix theory, in particular on the
graphical Weingarten calculus provided in [CN10], and on a novel estimate of the Weingarten function.

The paper is organized as follows. First we introduce the Matrix Product State formalism, then the
Weingarten function and calculus, and then basic results on the concentration of measure
phenomenon. Finally, in Section 4 we prove Theorem 0.1 using the tools already introduced together with
a novel asymptotic bound of the Weingarten function.



1. Random Matrix Product States 

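In this section we just fix the notation; for a detailed exposition see [PGVWC06]. Let $\dim(\mathcal{H}_A) = D$
and $\dim(\mathcal{H}_B) = d$. Our initial state $\rho$ given by Equation (1) can be expressed by means of the map
$\mathcal{E} : B(\mathcal{H}_A) \to B(\mathcal{H}_A \otimes \mathcal{H}_B)$ given by $\mathcal{E}(X) = \sum_{i,j} A_i X A_j^\dagger \otimes |i\rangle\langle j|$ simply as

$$\rho = \operatorname{tr}_A[L\,\mathcal{E}^n(R)],$$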



where one should understand the map as acting only on $A$ and creating the systems $B$ in order from 1 to $n$.

In the rest of this work we will be interested in the reduced state of the $l$ consecutive central sites of the
chain, where $l \ll n = 2t + l$, which, up to normalization, is described by

$$\rho_l = \operatorname{tr}_{A,B_1\ldots B_t,B_{t+l+1}\ldots B_n}[L\,\mathcal{E}^n(R)].$$

Figure 2. Graphical representation of $\rho_l$, where big squares represent matrices and the
small objects attached to them represent the tensors that form the matrices. Dark objects
correspond to "ket" tensors, white objects correspond to "bra" tensors, squares are used for
dimension $d$ and circles for dimension $D$. Wires represent contraction rules between tensors.

The general boundary conditions $L$ and $R$ come from tracing out the boundary sites as described in Figure 1.
MPS theory leads us to consider them as belonging respectively to the sets
$\mathcal{L} = \{L \ge 0 : \|L\|_\infty \le 1,\ L \in M_D\}$ and $\mathcal{R} = \{R \ge 0 : \operatorname{tr}(R) = 1,\ R \in M_D\}$,
where $\|\cdot\|_\infty$ denotes the usual operator norm. Diagonalizing
$L = V \Lambda V^\dagger$ and $R = W \Omega W^\dagger$, we can parametrize $\mathcal{L}$ by $[0,1]^D \times U(D)$ and $\mathcal{R}$ by $S_1([0,1]^D) \times U(D)$,
where $S_1([0,1]^D)$ is the set of $D$-event probability distributions. Again by symmetry considerations, this leads us to
sample $L$ using the Lebesgue measure on $[0,1]^D$ and the Haar measure on $U(D)$, and to sample $R$ using any
permutation-invariant measure on $S_1([0,1]^D)$ and the Haar measure on $U(D)$. We finally recall from the
introduction that the matrices $A_i$ in Equation (1) will be sampled from the Haar measure on $U(Dd)$ via the
parametrization $U \mapsto A_i = \langle 0|U|i\rangle$.

Summarizing, we take the ensemble of MPS defined by the tuple $(U, L, R) = (U, V, W, \Lambda, \Omega)$, where $U$, $V$
and $W$ are distributed with respect to the Haar measure in the respective unitary group, $\Lambda$ is distributed
according to the Lebesgue measure in $[0,1]^D$ and $\Omega$ according to any permutation-invariant probability
measure in $S_1([0,1]^D)$.
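As an illustration only, the ensemble just summarized can be sampled numerically along the following lines. This is a minimal sketch under stated assumptions, not the authors' code: the function names `haar_unitary` and `sample_ensemble` are hypothetical, the combined index of $U$ is assumed to be ordered as (physical index) $\times D$ + (bond index), and the flat Dirichlet distribution is just one admissible permutation-invariant choice for $\Omega$.

```python
import numpy as np

def haar_unitary(n, rng):
    """Haar-distributed n x n unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    diag = np.diagonal(r)
    # Fix the phase ambiguity of QR column by column so the law is Haar.
    return q * (diag / np.abs(diag))

def sample_ensemble(d, D, rng):
    """Draw (A_1..A_d, L, R); the index ordering below is an assumption."""
    U = haar_unitary(d * D, rng)
    # Assumed convention: T[b_out, a_out, b_in, a_in], physical index major.
    T = U.reshape(d, D, d, D)
    A = [T[0, :, i, :] for i in range(d)]      # A_i = <0|U|i>
    V, W = haar_unitary(D, rng), haar_unitary(D, rng)
    lam = rng.uniform(0.0, 1.0, size=D)        # Lebesgue measure on [0,1]^D
    omega = rng.dirichlet(np.ones(D))          # one permutation-invariant choice
    L = (V * lam) @ V.conj().T                 # V diag(lam) V^dagger
    R = (W * omega) @ W.conj().T               # W diag(omega) W^dagger
    return A, L, R
```

With this index convention the matrices satisfy $\sum_i A_i A_i^\dagger = \mathbb{1}$; under the opposite convention for $A_i$ the completeness relation holds with the daggers exchanged.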

Now our problem can be rephrased as: 

Given $(U, L, R)$ randomly chosen, find the behavior of the normalized state corresponding to

$$\rho_l(U,L,R) = \operatorname{tr}_{A,B_1\ldots B_t,B_{t+l+1}\ldots B_n}\!\left(L\,U_{AB_1}\cdots U_{AB_n}\,\big(R\otimes(|0\rangle\langle 0|)^{\otimes n}\big)\,U_{AB_n}^\dagger\cdots U_{AB_1}^\dagger\right),$$

where the systems of the sites $B_i$ are labeled from left to right and each unitary matrix acts on the site
indicated and on the ancillary system $A$, from right to left, leaving the other sites invariant.

This state is represented graphically in Figure 2.

2. Weingarten Function and Calculus 

The Weingarten function was first introduced in [Wei78]; for a complete description of this function we
refer to [Col03]. Here we just describe its main ingredients and focus on the graphical calculus introduced
in [CN10]. We will follow the standard notation of the representation theory of symmetric groups. We write
$\lambda \vdash p$ to mean that $\lambda$ is a partition of $p$; $\chi^\lambda$ is the corresponding character of $S_p$ and $s_{\lambda,d}(x) = s_\lambda(x,\dots,x)$ the
corresponding Schur function, see [Ful97]. If $\sigma \in S_p$ we denote by $|\sigma|$ the minimum number $k$ such that $\sigma$
can be written as a product of $k$ transpositions; $\#\sigma$ is the number of cycles of $\sigma$, and both quantities are
related by the formula $|\sigma| = p - \#\sigma$.

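Definition. The Weingarten function $\mathrm{Wg}(n,\sigma)$ takes as inputs a dimension parameter $n$ and a permutation
$\sigma$ in the symmetric group $S_p$ and is given by

$$\mathrm{Wg}(n,\sigma) = \frac{1}{p!^2}\sum_{\lambda\vdash p} \frac{(\chi^\lambda(1))^2\,\chi^\lambda(\sigma)}{s_{\lambda,n}(1)}.$$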



Its importance relies on the following theorem from [Col03], which tells us that the average of a monomial
over the unitary group can be computed in terms of sums of Weingarten functions.

Theorem 2.1. Let $n$ be a positive integer and $i = (i_1,\dots,i_p)$, $i' = (i'_1,\dots,i'_p)$, $j = (j_1,\dots,j_p)$ and $j' = (j'_1,\dots,j'_p)$
be $p$-tuples of positive integers from $1,2,\dots,n$. Then

$$\int_{U(n)} U_{i_1 j_1}\cdots U_{i_p j_p}\,\overline{U}_{i'_1 j'_1}\cdots \overline{U}_{i'_p j'_p}\,dU = \sum_{\sigma,\tau\in S_p} \delta_{i_1 i'_{\sigma(1)}}\cdots\delta_{i_p i'_{\sigma(p)}}\,\delta_{j_1 j'_{\tau(1)}}\cdots\delta_{j_p j'_{\tau(p)}}\,\mathrm{Wg}(n,\tau\sigma^{-1}).$$
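As a small sanity check of the definition and of Theorem 2.1, the following sketch evaluates the Weingarten function for $p = 2$ exactly from the character formula and uses it to compute $\int |U_{11}|^4\,dU$. The function names are hypothetical and the permutations of $S_2$ are encoded as the strings `"e"` (identity) and `"t"` (transposition); it assumes $n \ge 2$ so that both Schur values are nonzero.

```python
from fractions import Fraction
from itertools import product

def wg_s2(n, sigma):
    """Wg(n, sigma) for p = 2 from the character formula (requires n >= 2)."""
    # Characters of the partitions (2) and (1,1) evaluated at sigma.
    chars = {"e": (1, 1), "t": (1, -1)}[sigma]
    # s_{lambda,n}(1) for lambda = (2) and lambda = (1,1).
    schur = (Fraction(n * (n + 1), 2), Fraction(n * (n - 1), 2))
    # chi^lambda(1) = 1 for both partitions, and 1/p!^2 = 1/4.
    return Fraction(1, 4) * sum(Fraction(c) / s for c, s in zip(chars, schur))

def fourth_moment_u11(n):
    """Integral of |U_11|^4 over U(n), from Theorem 2.1 with all indices 1."""
    total = Fraction(0)
    for s, t in product("et", repeat=2):
        # All Kronecker deltas equal 1; in S_2, t s^{-1} is the identity iff s == t.
        total += wg_s2(n, "e" if s == t else "t")
    return total
```

The two values obtained this way are $\mathrm{Wg}(n,\mathrm{id}) = 1/(n^2-1)$ and $\mathrm{Wg}(n,(12)) = -1/(n(n^2-1))$, and the resulting moment agrees with the known value $2/(n(n+1))$.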

In [CN10] the authors introduce a graphical paradigm in order to simplify the computation of the average
of polynomials over the unitary group. Consider a polynomial $P(U)$ of degree $p$ in $U$ and $\overline{U}$; then

$$\mathbb{E}_U(P(U)) = \sum_{\sigma,\tau\in S_p} C_{(\sigma,\tau)}\,\mathrm{Wg}(n,\tau\sigma^{-1}), \tag{2}$$

where the coefficients $C_{(\sigma,\tau)}$ can be computed by the following procedure, illustrated in Figures 2, 3 and 4. One has
to enumerate the matrices $U$ and $\overline{U}$ respectively from 1 to $p$, and for any two permutations $\sigma,\tau \in S_p$ we
delete the $U$ and $\overline{U}$ boxes and connect the white square and circle in $U_i$ with the white square and circle
respectively in $\overline{U}_{\sigma(i)}$, and analogously with the black objects and the permutation $\tau$. Now, loops represent
traces over the matrices involved in them. If there is no matrix involved in a loop, then it represents the trace of
the identity of the system. Finally, if there are paths that are not loops, they translate into the contraction
with the boundary conditions that appear in them. The number $C_{(\sigma,\tau)}$ is just the product of the values of all
the contractions. Note that, as drawn in Figure 2, a monomial in $U^\dagger$ can be substituted by a monomial in
$\overline{U}$.

3. Measure of Concentration Phenomenon 

In this section we introduce the basic results of the measure concentration phenomenon that we are going
to use; for a detailed exposition see for example [MS86, Led01].

Definition. Let $(X, d)$ be a metric space with probability measure $\mu$. Its concentration function $\alpha_{(X,d,\mu)}$ is
defined as

$$\alpha_{(X,d,\mu)}(r) = \sup\{1 - \mu(A_r)\,;\ A \subset X,\ \mu(A) \ge \tfrac12\}, \quad r > 0,$$

where $A_r = \{x \in X;\ d(x, A) < r\}$ is the open $r$-neighbourhood of $A$ (with respect to $d$).

This definition allows one to prove directly that almost all the images of a Lipschitz function concentrate
around the median, where the concentration factor is given by the concentration function. Nevertheless, we
are interested in the concentration around the mean, which is a consequence of the former, as one can bound
the distance between the median and the mean in terms of the concentration function.

Theorem 3.1 (Measure concentration phenomenon). Let $F$ be a Lipschitz function on $(X,d)$, and $\mu$ a
probability measure on $(X,d)$. Then

$$\mu(\{F \ge \mathbb{E}_\mu(F) + r\}) \le 2\alpha_\mu(r/\|F\|_{\mathrm{Lip}}),$$

$$\mu(\{F \le \mathbb{E}_\mu(F) - r\}) \le 2\alpha_\mu(r/\|F\|_{\mathrm{Lip}}),$$

where $\mathbb{E}_\mu(F)$ is the mean of $F$ with respect to $\mu$ and $\|F\|_{\mathrm{Lip}}$ is the Lipschitz constant of $F$.

When one is interested in the concentration properties of a family of spaces, what matters is the scaling 
of the concentration function depending on the parameter defining the family of spaces. Thus looking at the 
definition of concentration function one can prove that the concentration properties of two spaces behave at 
least as well as the worst one of the two. 

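Proposition 3.2. Let $\mu$, $\nu$ be two probability measures on metric spaces $(X, d)$ and $(Y, \delta)$ respectively. Then,
if $\mu \times \nu$ is the product measure on $X \times Y$ equipped with the $\ell_1$-metric, $\alpha_{(X\times Y,\,d+\delta,\,\mu\times\nu)} \le \alpha_{(X,d,\mu)} + \alpha_{(Y,\delta,\nu)}$.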

If we apply this proposition to the spaces we are interested in we have the following lemma. 

Lemma 3.3. Let $\mu$ be the Haar measure on $(U(D), d_2)$, the unitary group with the Hilbert-Schmidt distance.
Let $\nu$ be the Lebesgue measure on $([0,1]^D, d_\infty)$, the hypercube with the maximum distance. Then, for any
$k \in \mathbb{N}$, the product space $(X, d_1)$, where $X = U(kD) \times U(D) \times U(D) \times [0,1]^D$ and $d_1$ is the $\ell_1$ distance of
the product space, with the product probability measure $\eta = \mu \times \mu \times \mu \times \nu$, has concentration function

$$\alpha_{(X,d_1,\eta)}(r) \le c\,e^{-C D r^2},$$

where $c$ and $C$ are universal constants.



Remark. Note that in this lemma, which will be used to prove the concentration in Theorem 4.7, we do not
consider the space $S_1([0,1]^D)$, as in all the theorems below the result holds independently of $\Omega \in S_1([0,1]^D)$.

4. Proof of the main Theorem 

Recall that we are considering the ensemble of MPS defined by the tuple $(U, L, R) = (U, V, W, \Lambda, \Omega)$,
where $U$, $V$ and $W$ are distributed with respect to the Haar measure in the respective unitary group, $\Lambda$
is distributed according to the Lebesgue measure in $[0,1]^D$ and $\Omega$ according to any permutation-invariant
probability measure in $S_1([0,1]^D)$. To prove Theorem 0.1 we need to compute the mean and the Lipschitz
constant of the trace-normalized version of $f(\rho) = \operatorname{tr}(\rho_l^2(U,L,R))$ over the introduced ensemble. The
difficulty of this calculation comes from computing the mean of the function. To simplify our computations we
will first compute the mean and the Lipschitz constant for both the function $f(\rho)$ and its normalization function
$g(\rho) = (\operatorname{tr}\rho_l(U,L,R))^2$ and then argue about the concentration of $\operatorname{tr}(\rho_{\mathrm{Norm}}^2) = f(\rho)/g(\rho)$. To compute the
mean of $f(\rho)$ we first need to give a novel asymptotic bound of the Weingarten function.

Theorem 4.1. Let $p$, $n$ and $k$ be nonnegative integers such that $p^k < n$. Then there exists a constant $K$
depending only on $k$ such that for any $\sigma \in S_p$,

$$\mathrm{Wg}(n,\sigma) \le K\,n^{-p-|\sigma|(1-2/k)}.$$

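Proof. We recall, see [Ful97], that for any partition $\lambda \vdash p$ of the integer $p$,

$$s_{\lambda,n}(1) = \frac{\chi^\lambda(1)}{p!}\prod_{i=1}^{p}(n-\lambda_i),$$

where the $\lambda_i$ are integers (minus the contents of the boxes of $\lambda$) satisfying $|\lambda_i| \le p-1$. Equivalently, the Weingarten function becomes

$$\mathrm{Wg}(n,\sigma) = \frac{n^{-p}}{p!}\sum_{\lambda\vdash p}\chi^\lambda(1)\,\chi^\lambda(\sigma)\prod_{i=1}^{p}\left(1-\frac{\lambda_i}{n}\right)^{-1}. \tag{3}$$

Consider the function

$$f_\lambda : z \mapsto \Big(\prod_{i=1}^{p}(1 - z\lambda_i)\Big)^{-1}.$$

This function is holomorphic in a neighborhood of zero. Moreover, since $2 < k$ we have $p^2 < n$, and for any
$|z| \le p^{-2}$,

$$|f_\lambda(z)| \le e.$$

As a consequence, writing $f_\lambda(z) = \sum_{i\ge 0} a_{i,\lambda} z^i$, we obtain the Cauchy estimate

$$|a_{i,\lambda}| \le e\,p^{2i}.$$

But equation (3) implies that

$$\mathrm{Wg}(n,\sigma) = \frac{n^{-p}}{p!}\sum_{\lambda\vdash p}\chi^\lambda(1)\,\chi^\lambda(\sigma)\Big(1 + \sum_{i\ge 1} a_{i,\lambda}\,n^{-i}\Big).$$

Therefore the coefficient of $n^{-p-i}$ has norm smaller than $e\,p^{2i}$. But all coefficients are zero until $i = |\sigma|$
[Col03]; therefore

$$\mathrm{Wg}(n,\sigma) \le n^{-p-|\sigma|}\,e\,\Big(1 + \frac{p^2}{n} + \Big(\frac{p^2}{n}\Big)^2 + \cdots\Big)\,p^{2|\sigma|}.$$

For $n \ge 2$, and since $p^k < n$, this implies $p^{2|\sigma|} < n^{2|\sigma|/k}$. Furthermore, $\big(1 + \frac{p^2}{n} + (\frac{p^2}{n})^2 + \cdots\big)$ can be bounded
by a universal constant (5, for example). The result follows.

□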
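To make the scaling in Theorem 4.1 concrete, the following sketch evaluates the standard closed-form expressions of the Weingarten function for $p = 3$ and compares them with $n^{-p-|\sigma|(1-2/k)}$ for $k = 5$. It is an illustration only: the closed forms are the well-known $p = 3$ values, the function names are hypothetical, and the permutation is passed through its length $|\sigma| \in \{0, 1, 2\}$ (identity, transposition, 3-cycle).

```python
from fractions import Fraction

def wg_s3(n, length):
    """Closed-form Wg(n, sigma) for p = 3, indexed by |sigma| (requires n >= 3)."""
    denom = (n * n - 1) * (n * n - 4)
    if length == 0:                      # identity
        return Fraction(n * n - 2, n * denom)
    if length == 1:                      # transposition
        return Fraction(-1, denom)
    return Fraction(2, n * denom)        # 3-cycle

def bound_ratio(n, length, p=3, k=5):
    """|Wg(n, sigma)| divided by n^{-p - |sigma| (1 - 2/k)}."""
    return float(abs(wg_s3(n, length))) * n ** (p + length * (1 - 2 / k))

# p^k = 243, so the hypothesis p^k < n holds for these values of n.
ratios = {n: [bound_ratio(n, s) for s in range(3)] for n in (244, 1000, 10**4)}
```

In this small case the ratios stay bounded as $n$ grows, which is the behavior the theorem asserts with a constant $K$ depending only on $k$.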
Figure 3. Computation of $\mathbb{E}_{L,R}[\operatorname{tr}(LRLR)]$ using Weingarten graphical calculus: for any
two permutations $\alpha, \beta \in S_2$ delete the $U$ and $\overline{U}$ matrices, join the white circle of $U_i$ with the
white circle of $\overline{U}_{\alpha(i)}$, and join the black circle of $U_i$ with the black circle of $\overline{U}_{\beta(i)}$. The number
$C_{(\alpha,\beta)}$ is the product of the traces involved in the new picture.



To organize the computations and the reasoning used in the bound of the mean of f(p) we prove the 
following two lemmas. 

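Lemma 4.2. Let $\alpha, \beta, \gamma \in S_p$.

a) The quantity $|\gamma^{-1}\alpha\gamma\alpha^{-1}\beta| + |\beta|$ is an even number.

b) If $\gamma^{-1}\alpha\gamma\alpha^{-1}\beta = c$ and $\gamma^{-1}\alpha'\gamma\alpha'^{-1}\beta = c$, then $\alpha'^{-1}\alpha$ commutes with $\gamma$.

c) If $p = 2n + 4$, $\gamma = (2n+1, 1, 2, \dots, n, 2n+3)(2n+2, n+1, n+2, \dots, 2n, 2n+4)$ and $\alpha(2n+i) = 2n+i$
for $i = 1, \dots, 4$, then the function that takes $(\alpha, \beta)$ to $(g, h)$ with $g = \gamma^{-1}\alpha\gamma\alpha^{-1}$ and $h = \beta\alpha^{-1}$ is one to
one.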



Proof. a) The result follows from the fact that the parity of $|\alpha\beta|$ is the same as the parity of $|\alpha| + |\beta|$.

b) If $\gamma^{-1}\alpha\gamma\alpha^{-1}\beta = c$ and $\gamma^{-1}\alpha'\gamma\alpha'^{-1}\beta = c$, then $\alpha\gamma\alpha^{-1} = \alpha'\gamma\alpha'^{-1}$. Multiplying by the inverse of the
right hand side we have that $\gamma^{-1}\alpha'^{-1}\alpha\gamma\alpha^{-1}\alpha' = 1$, which happens if and only if $\alpha'^{-1}\alpha$ commutes with $\gamma$.

c) If $\alpha$ is fixed, the change $h = \beta\alpha^{-1}$ is clearly one to one. Now, by b), the change $g = \gamma^{-1}\alpha\gamma\alpha^{-1}$ is one
to one if and only if $\gamma^{-1}\alpha\gamma\alpha^{-1} = 1$ has only the trivial solution $\alpha = 1$, which can easily be checked to be the
case from the definition of $\gamma$ and the constraints on $\alpha$. □
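The last step of part c) can be checked by brute force for small $n$. The sketch below is only an illustration with hypothetical function names: it builds $\gamma$ from its two cycles, enumerates all $\alpha \in S_{2n+4}$ that fix $2n+1, \dots, 2n+4$, and collects those that commute with $\gamma$, which is equivalent to $\gamma^{-1}\alpha\gamma\alpha^{-1} = 1$.

```python
from itertools import permutations

def gamma(n):
    """The permutation gamma on {1, ..., 2n+4}, as a dict, from its two cycles."""
    cycles = [
        [2 * n + 1, *range(1, n + 1), 2 * n + 3],
        [2 * n + 2, *range(n + 1, 2 * n + 1), 2 * n + 4],
    ]
    g = {}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            g[a] = b                      # each element maps to the next in its cycle
    return g

def commuting_alphas(n):
    """All alpha fixing 2n+1..2n+4 that commute with gamma."""
    g = gamma(n)
    found = []
    for image in permutations(range(1, 2 * n + 1)):
        alpha = {i + 1: image[i] for i in range(2 * n)}
        alpha.update({2 * n + i: 2 * n + i for i in range(1, 5)})
        if all(alpha[g[x]] == g[alpha[x]] for x in g):
            found.append(alpha)
    return found
```

For the values of $n$ that are feasible to enumerate, one expects the returned list to contain only the identity, in line with the claim in the proof.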

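Lemma 4.3. Let $L \in \mathcal{L}$ and $R \in \mathcal{R}$ be given at random with respect to the measures introduced in Section 1.
For any $\Omega \in S_1([0,1]^D)$, we have that $\mathbb{E}[\operatorname{tr}(L)] = D/2$, $\mathbb{E}[\operatorname{tr}(L^2)] = D/4$, $\mathbb{E}[\operatorname{tr}(R)] = 1$, $\mathbb{E}[\operatorname{tr}(R^2)] \le 1$,
$\mathbb{E}[\operatorname{tr}(LR)] = 1/2$, $\mathbb{E}[\operatorname{tr}(LRR)] \le 1/2$, $\mathbb{E}[\operatorname{tr}(LLR)] = 1/4$, $\mathbb{E}[\operatorname{tr}(LLRR)] \le 1/4$ and
$\mathbb{E}[\operatorname{tr}(LRLR)] \le 1/4 + 1/(4D)$.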

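Proof. These averages are not difficult to compute directly, but they can also be computed using the graphical
calculus described in Section 2, as a warm-up for the forthcoming computations. We compute the last
one as an example:

$$\mathbb{E}[\operatorname{tr}(LRLR)] = \mathbb{E}[\operatorname{tr}(V\Lambda V^\dagger W\Omega W^\dagger V\Lambda V^\dagger W\Omega W^\dagger)] = \mathbb{E}[\operatorname{tr}(U\Lambda U^\dagger\Omega U\Lambda U^\dagger\Omega)],$$

where $\Lambda \in [0,1]^D$, $\Omega \in S_1([0,1]^D)$ and $U, V, W \in U(D)$, and the second equality follows by the invariance of
the Haar measure. Using that $\mathbb{E}_\Lambda[\operatorname{tr}\Lambda] = D/2$, $\mathbb{E}_\Lambda[\operatorname{tr}(\Lambda^2)] = D/4$, $\operatorname{tr}(\Omega^2) \le 1$ and $(\operatorname{tr}\Omega)^2 = 1$, together with
the graphical calculus in Figure 3, we get

$$\mathbb{E}_{L,R}[\operatorname{tr}(LRLR)] = \mathbb{E}_{\Lambda,\Omega}\Big[\sum_{\alpha,\beta\in S_2} C_{(\alpha,\beta)}\,\mathrm{Wg}(D,\alpha\beta^{-1})\Big]$$

$$= \mathbb{E}_{\Lambda,\Omega}\Big[\big((\operatorname{tr}\Lambda)^2\operatorname{tr}(\Omega^2) + \operatorname{tr}(\Lambda^2)(\operatorname{tr}\Omega)^2\big)\mathrm{Wg}(D,(1)(2)) + \big((\operatorname{tr}\Lambda)^2(\operatorname{tr}\Omega)^2 + \operatorname{tr}(\Lambda^2)\operatorname{tr}(\Omega^2)\big)\mathrm{Wg}(D,(12))\Big]$$

$$\le 1/4 + 1/(4D),$$

where $\mathrm{Wg}(D,(1)(2)) = 1/(D^2-1)$ and $\mathrm{Wg}(D,(12)) = -1/(D(D^2-1))$.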

□ 

Now, we have all the ingredients in order to compute the averages of f(p) and g(p). 

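Theorem 4.4. Let $\rho_l$ be taken at random from the ensemble introduced. Then, for any $\Omega \in S_1([0,1]^D)$ and
$U \in U(dD)$, we have that

$$\mathbb{E}_{\Lambda,V,W}(\operatorname{tr}\rho_l(\Lambda,\Omega,U,V,W)) = 1/2.$$

Proof.

$$\mathbb{E}_{\Lambda,V,W}(\operatorname{tr}\rho_l(\Lambda,\Omega,U,V,W)) = \mathbb{E}_\Lambda\big(\operatorname{tr}\,\mathbb{E}_{V,W}[V\Lambda V^\dagger\,\mathcal{E}^n(W\Omega W^\dagger)]\big)$$

$$= \mathbb{E}_\Lambda\Big(\operatorname{tr}_A\operatorname{tr}_{B_1,\dots,B_n}\Big[\frac{\operatorname{tr}(\Lambda)}{D}\,\mathcal{E}^n\Big(\frac{\operatorname{tr}(\Omega)}{D}\mathbb{1}\Big)\Big]\Big) = \mathbb{E}_\Lambda\Big(\operatorname{tr}_A\Big[\operatorname{tr}(\Lambda)\frac{\mathbb{1}}{D^2}\Big]\Big) = \mathbb{E}_\Lambda(\operatorname{tr}(\Lambda)/D) = 1/2.$$

The first equality follows by linearity of the trace, the third because $\mathbb{1}_A$ is the fixed point of $\operatorname{tr}_B(\mathcal{E})$, and the
remaining ones just by computing the averages and traces themselves. □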

Note that in this proof we are averaging over $V$ and $W$, and our expression is a polynomial of degree
one with respect to them. This, together with the fact that the average is independent of $U$, makes it easy
to compute the average. The bound for the other function is much more involved and makes use of the
asymptotic bound of the Weingarten function and Lemmas 4.2 and 4.3. We state it here; the proof is given
in Appendix A.

Theorem 4.5. Let $\rho_l$ be taken at random from the ensemble introduced and $D \ge n^5$. Then, for any $\Omega \in S_1([0,1]^D)$,

$$\mathbb{E}(\operatorname{tr}\rho_l^2) \le \frac{1}{4d^l} + O(D^{-1/5}).$$

In order to apply the measure concentration phenomenon we only need to compute the Lipschitz constant
of the functions we are interested in. The proof of the following theorem is given in Appendix B.

Theorem 4.6 (Lipschitz constants). For any $\Omega \in S_1([0,1]^D)$, let $f(U,V,W,\Lambda) = (\operatorname{tr}\rho_l(U,V,W,\Lambda,\Omega))^2$ and
$g(U,V,W,\Lambda) = \operatorname{tr}\rho_l^2(U,V,W,\Lambda,\Omega)$, where

$$\rho_l(U,V,W,\Lambda,\Omega) = \operatorname{tr}_{A,B_1\ldots B_t,B_{t+l+1}\ldots B_n}\!\left(V\Lambda V^\dagger\,U_{AB_1}\cdots U_{AB_n}\,\big(W\Omega W^\dagger\otimes(|0\rangle\langle 0|)^{\otimes n}\big)\,U_{AB_n}^\dagger\cdots U_{AB_1}^\dagger\right).$$

Then the Lipschitz constants of both functions are upper bounded by $4n + 10$.

Now we can describe the behavior of the 2-Rényi entropy, or equivalently the purity, of the normalized
state $\rho_{\mathrm{Norm}} = \rho_l/\operatorname{tr}(\rho_l)$.

Theorem 4.7. Let $\rho_l$ be taken at random from the ensemble introduced with $D \ge n^5$. Then
$\operatorname{tr}(\rho_{\mathrm{Norm}}^2) = \operatorname{tr}\rho_l^2/(\operatorname{tr}\rho_l)^2 = 1/d^l + O(D^{-1/5})$ except with probability exponentially small in $D$.



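Proof. Putting together the measure concentration phenomenon (Theorem 3.1), the bounds on the Lipschitz constants
(Theorem 4.6) and the union bound, we have for all $\Omega \in S_1([0,1]^D)$ that, except with probability $c_1 e^{-c_2\varepsilon^2 D/n^2}$,

$$\operatorname{tr}(\rho_l^2) \le \mathbb{E}(\operatorname{tr}(\rho_l^2)) + \varepsilon \quad\text{and}\quad (\operatorname{tr}\rho_l)^2 \ge \mathbb{E}((\operatorname{tr}\rho_l)^2) - \varepsilon$$

both at the same time, where $c_1$ and $c_2$ are universal constants. Thus, we can bound

$$\frac{\operatorname{tr}(\rho_l^2)}{(\operatorname{tr}\rho_l)^2} \le \frac{\mathbb{E}(\operatorname{tr}(\rho_l^2)) + \varepsilon}{\mathbb{E}((\operatorname{tr}\rho_l)^2) - \varepsilon} \le \frac{\mathbb{E}(\operatorname{tr}(\rho_l^2)) + \varepsilon}{(\mathbb{E}(\operatorname{tr}\rho_l))^2 - \varepsilon} \le \frac{\frac{1}{4d^l} + O(D^{-1/5}) + \varepsilon}{1/4 - \varepsilon} = \frac{1}{d^l} + O(D^{-1/5}),$$

where the second inequality follows from Jensen's inequality, the third inequality follows from Theorems 4.4
and 4.5, and in the last equality we have used that we can take $\varepsilon = O(D^{-1/5})$. The result follows. □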
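A numerical experiment can make Theorem 4.7 tangible. The following is a rough sketch, not the authors' procedure: it draws one sample of the ensemble for small $d$, $D$, $t$ and $l$, contracts the MPS to obtain $\rho_l$, and evaluates $\operatorname{tr}\rho_l^2/(\operatorname{tr}\rho_l)^2$. All function names are hypothetical, the index ordering of $U$ and the Dirichlet choice for $\Omega$ are assumptions, and the traced-out sites are handled by iterating the maps $X \mapsto \sum_i A_i X A_i^\dagger$ on $R$ and $X \mapsto \sum_i A_i^\dagger X A_i$ on $L$.

```python
import numpy as np
from itertools import product

def haar_unitary(n, rng):
    """Haar-distributed unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    diag = np.diagonal(r)
    return q * (diag / np.abs(diag))

def random_mps_data(d, D, rng):
    """One draw of (A_i, L, R); index ordering and Omega's law are assumptions."""
    T = haar_unitary(d * D, rng).reshape(d, D, d, D)
    A = [T[0, :, i, :] for i in range(d)]
    V, W = haar_unitary(D, rng), haar_unitary(D, rng)
    L = (V * rng.uniform(size=D)) @ V.conj().T
    R = (W * rng.dirichlet(np.ones(D))) @ W.conj().T
    return A, L, R

def reduced_state(A, L, R, t, l):
    """Unnormalized rho_l of the l central sites of a chain of 2t + l sites."""
    d = len(A)
    for _ in range(t):                        # trace out the t sites next to R
        R = sum(a @ R @ a.conj().T for a in A)
    for _ in range(t):                        # trace out the t sites next to L
        L = sum(a.conj().T @ L @ a for a in A)
    mats = []
    for word in product(range(d), repeat=l):  # A_{i_1} ... A_{i_l}
        m = np.eye(L.shape[0], dtype=complex)
        for i in word:
            m = m @ A[i]
        mats.append(m)
    return np.array([[np.trace(L @ mi @ R @ mj.conj().T) for mj in mats] for mi in mats])

def normalized_purity(rho):
    """tr(rho^2) / (tr rho)^2 for a Hermitian positive matrix rho."""
    return float(np.real(np.trace(rho @ rho)) / np.real(np.trace(rho)) ** 2)
```

Calling `normalized_purity(reduced_state(*random_mps_data(d, D, rng), t, l))` for increasing `D` is one way to observe how the purity compares with the lower bound $1/d^l$; the theorem suggests the gap should shrink as $D$ grows.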



Finally we can easily prove our main theorem, which bounds the distance between the reduced density 
matrix of a generic random MPS and the completely mixed state. 



Proof of Theorem 0.1.

$\rho_{\mathrm{Norm}} = \rho_l/\operatorname{tr}\rho_l$ is trace normalized, that is, its eigenvalues sum up to one. Thus, in order to have an
eigenvalue of $\rho_{\mathrm{Norm}}$ as far as possible from $1/d^l$ at the purity given by Theorem 4.7, the distribution of eigenvalues optimizing this problem is the
one that has one eigenvalue as small or as big as possible and the rest all equal. In both cases the distance
between this eigenvalue and $1/d^l$ is $\sqrt{(d^l-1)/d^l}\;O(D^{-1/10})$. □
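The extremal spectrum used in this argument can be written down explicitly. In the sketch below (hypothetical function name; $m$ stands for $d^l$ and `eps` for the excess purity over $1/m$), one eigenvalue is $1/m + x$ and the remaining $m-1$ are equal, and solving the purity constraint gives $x = \sqrt{(m-1)\,\varepsilon/m}$, which is where the square root of the $O(D^{-1/5})$ term comes from.

```python
import math

def extremal_spectrum(m, eps):
    """Spectrum of trace one and purity 1/m + eps with one eigenvalue
    displaced upward from 1/m by the maximal amount x; assumes eps is
    small enough for all eigenvalues to stay nonnegative."""
    x = math.sqrt((m - 1) / m * eps)
    return [1 / m + x] + [1 / m - x / (m - 1)] * (m - 1)

spec = extremal_spectrum(4, 0.01)
trace = sum(spec)                      # equals 1 up to rounding
purity = sum(v * v for v in spec)      # equals 1/4 + 0.01 up to rounding
deviation = spec[0] - 1 / 4            # equals sqrt(0.75 * 0.01)
```

The mirrored case, with one eigenvalue displaced downward by the same $x$, gives the same distance.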
5. Conclusions

In this work we have shown how reduced density matrices of small subsystems of translationally invariant
random MPS have generically maximum entropy. This can be read as recovering Jaynes' principle of
maximum entropy in the situation where the prior information to incorporate in the sampling procedure is
the locality and homogeneity of the interactions. For that we have relied on the (well justified) fact that
MPS are the right representation for ground states of one-dimensional local Hamiltonians and on the natural
way of sampling MPS based on the symmetry principle.

We acknowledge Ion Nechita for very insightful discussions during the preparation of this manuscript and 
Mittag-Leffler institute for the organization of the 2010 fall in Quantum Information Theory where this work 
was initiated. C.G.G. and D.P.G.'s research was supported by EU grant QUEVADIS and Spanish projects 
QUITEMAD and MTM2011-26912. B.C.'s research was supported by ANR GranMa, ANR Galoisint and 
NSERC grant RGPIN/341303-2007. 
Appendix A. Proof of Theorem 4.5

We first take the average over $U$; using equation (2) together with the bound of Theorem 4.1 we have

$$\mathbb{E}(\operatorname{tr}\rho_l^2) = \mathbb{E}_{L,R}[C_{(1,1)}]\,\mathrm{Wg}(Dd,1) + \sum_{(\alpha,\beta)\neq(1,1)\in S_{2n}} \mathbb{E}_{L,R}[C_{(\alpha,\beta)}]\,\mathrm{Wg}(Dd,\beta\alpha^{-1})$$

$$\le (Dd)^{-2n}\,\mathbb{E}_{L,R}[C_{(1,1)}] + K\sum_{(\alpha,\beta)\neq(1,1)\in S_{2n}} \mathbb{E}_{L,R}[C_{(\alpha,\beta)}]\,(Dd)^{-2n-(3/5)|\beta\alpha^{-1}|},$$

where we are separating the case $\alpha = \beta = 1$ and using the known value of the Weingarten function
$\mathrm{Wg}(Dd, 1) = (Dd)^{-2n}$. The reason to do so is that this term is the largest one in the sum (as will become
clear through the proof).

Figure 4. Computation of $\mathbb{E}_U(\operatorname{tr}\rho_l^2(L, R, U))$ using Weingarten graphical calculus: for any
two permutations $\alpha, \beta \in S_{2n}$ delete the $U$ and $\overline{U}$ matrices, join the white circle of $U_i$ with
the white circle of $\overline{U}_{\alpha(i)}$, and join the black circle of $U_i$ with the black circle of $\overline{U}_{\beta(i)}$. The number
$C_{(\alpha,\beta)}$ is the product of the traces involved in the new picture.

In order to compute the coefficients $C_{(\alpha,\beta)}$ we apply the graphical Weingarten calculus to Figure 4. That
is, we are given a permutation $\alpha$ that links the white squares and circles of the $U$'s with the white squares and
circles of the $\overline{U}$'s, and a permutation $\beta$ that links the black circles of the $U$'s with the black circles of the $\overline{U}$'s. We
enumerate the $U$ matrices from left to right and top to bottom, and do the same for the $\overline{U}$ matrices. Moreover,
we enumerate the matrices $L$ as $2n+1$ and $2n+2$ and the matrices $R$ as $2n+3$ and $2n+4$.

Now the links (from left to right) between the circles of the matrices $U$, $L$ and $R$ are given by the function

$$\gamma = \begin{pmatrix} 2n+1 & 1 & 2 & \cdots & n \\ 1 & 2 & 3 & \cdots & 2n+3 \end{pmatrix}\begin{pmatrix} 2n+2 & n+1 & n+2 & \cdots & 2n \\ n+1 & n+2 & n+3 & \cdots & 2n+4 \end{pmatrix}.$$

One can add two extra (non-existing) links $\gamma(2n+3) = 2n+1$ and $\gamma(2n+4) = 2n+2$; that way $\gamma$ is a permutation. Analogously, we have
the same permutation $\gamma$ for the $\overline{U}$ matrices. The permutation relating the links of the squares of $U$ and $\overline{U}$ is
$\tau = (t+1, n+t+1)(t+2, n+t+2)\cdots(t+l, n+t+l)$. Besides, define $\alpha' = \alpha(2n+1)(2n+2)(2n+3)(2n+4)$
and $\beta' = \beta(2n+1)(2n+2)(2n+3)(2n+4)$ as the permutations $\alpha$, $\beta$ considered as elements of
$S_{2n+4}$.

The number of loops relating the circles is $\#\gamma^{-1}\alpha'\gamma\beta'^{-1} - 2 = 2n + 2 - |\gamma^{-1}\alpha'\gamma\beta'^{-1}|$, taking into account
those where $L$ or $R$ appears. Note that we are subtracting the 2 loops $(2n+1)(2n+2)$ that we added
when including the (non-existing) links of the permutation $\gamma$. The number of loops relating the squares is
$\#\tau\alpha = 2n - |\tau\alpha|$. All the loops are trivial and thus they correspond to the dimension of the system, except
those where $L$ or $R$ appears, in which we will take averages. For $\alpha = \beta = 1$, we have that

$$\mathbb{E}_{L,R}[C_{(1,1)}] = \mathbb{E}_{L,R}\big[(\operatorname{tr}L)^2(\operatorname{tr}R)^2 D^{2n-2} d^{2n-l}\big] = \tfrac14 D^{2n} d^{2n-l}.$$



Taking averages in $L$ and $R$ and using the bounds from Lemma 4.3, it can be shown, by inspection of all
possible combinations of $L$ and $R$ in different loops, that it is enough to distinguish the following two cases:

(1) $(\alpha, \beta) \in A_{2n}$, where $A_{2n}$ is the set of tuples for which $2n+3$ and $2n+4$ are not in the same cycle of the
permutation $\gamma^{-1}\alpha'\gamma\alpha'^{-1}\beta$, that is, both $R$ matrices appear in different loops. In this case we have
that

$$\mathbb{E}_{L,R}[C_{(\alpha,\beta)}] \le \tfrac14 D^{2n-|\gamma^{-1}\alpha'\gamma\alpha'^{-1}\beta|}\,d^{2n-|\tau\alpha|} \le \tfrac14 D^{2n-|\gamma^{-1}\alpha'\gamma\alpha'^{-1}\beta|}\,d^{2n}.$$

Making the change of variables $h = \beta\alpha^{-1}$, $g = \gamma^{-1}\alpha'\gamma\alpha'^{-1}$, which is proven to be one to one in Lemma
4.2, and denoting $h' = h(2n+1)(2n+2)(2n+3)(2n+4)$, we get that

$$\sum_{(\alpha,\beta)\in A_{2n}\setminus\{(1,1)\}} \mathbb{E}_{L,R}[C_{(\alpha,\beta)}]\,\mathrm{Wg}(Dd,\beta\alpha^{-1}) \le K\sum_{(g,h)\neq(1,1)} D^{-|gh'^{-1}|-(3/5)|h|}\,d^{-(3/5)|h|}$$

$$\le K\sum_{g\neq 1} D^{-|g|} + K\sum_{g,\;h\neq 1} D^{-|gh'^{-1}|-(3/5)|h|}$$

$$\le K\left(\sum_{|g|\ge 1}\Big(\frac{(2n+4)(2n+3)}{2D}\Big)^{|g|} + \sum_{|g|\ge 0}\Big(\frac{(2n+4)(2n+3)}{2D}\Big)^{|g|}\sum_{|h|\ge 1}\Big(\frac{2n(2n-1)}{2D^{3/5}}\Big)^{|h|}\right)$$

$$\le K\left(\frac{(2n+4)(2n+3)}{2D-(2n+4)(2n+3)} + \frac{2D}{2D-(2n+4)(2n+3)}\cdot\frac{2n(2n-1)}{2D^{3/5}-2n(2n-1)}\right).$$

In the next-to-last inequality we just upper bound the number of permutations with a given number of
transpositions; the last one is just a geometric sum. As $D \ge n^5$ we get further

$$\sum_{(\alpha,\beta)\in A_{2n}\setminus\{(1,1)\}} \mathbb{E}_{L,R}[C_{(\alpha,\beta)}]\,\mathrm{Wg}(Dd,\beta\alpha^{-1}) \le O(D^{-1/5}).$$

(2) $(\alpha, \beta) \in B_{2n}$, where $B_{2n}$ is the set of tuples for which $2n+3$ and $2n+4$ are in the same cycle of the
permutation $\gamma^{-1}\alpha'\gamma\alpha'^{-1}\beta$, that is, both $R$ matrices appear in the same loop. In this case we have
that

$$\mathbb{E}_{L,R}[C_{(\alpha,\beta)}] \le D^{2n+1-|\gamma^{-1}\alpha'\gamma\alpha'^{-1}\beta|}\,d^{2n-|\tau\alpha|} \le D^{2n+1-|\gamma^{-1}\alpha'\gamma\alpha'^{-1}\beta|}\,d^{2n}.$$

Applying the same change of variables and considering $B'_{2n}$, the image of $B_{2n}$ under the change of
variables, we get

$$\sum_{(\alpha,\beta)\in B_{2n}} \mathbb{E}_{L,R}[C_{(\alpha,\beta)}]\,\mathrm{Wg}(Dd,\beta\alpha^{-1}) \le K\sum_{(g,h)\in B'_{2n}} D^{1-|gh'^{-1}|-(3/5)|h|}\,d^{-(3/5)|h|}.$$

In order to bound this sum one has to proceed more carefully. The proof follows by bounding
independently the different cases where $h = 1$, $h = (2n+3, 2n+4)$, $h$ is a different transposition,
and the rest of the terms. For all these cases one has to take into account the properties of the elements
in $B'_{2n}$, that is, that $2n+3$ and $2n+4$ belong to the same cycle of $gh'^{-1}$, and the parity of $|gh'^{-1}| + |h|$
that is proven in Lemma 4.2. Following this procedure one can prove that

$$\sum_{(\alpha,\beta)\in B_{2n}} \mathbb{E}_{L,R}[C_{(\alpha,\beta)}]\,\mathrm{Wg}(Dd,\beta\alpha^{-1}) \le O(D^{-1/5}).$$

The result follows by joining the two cases and the case $\alpha = \beta = 1$.
Appendix B. Proof of Theorem 4.6

We use the notation $\|\cdot\|_p$ for the Schatten $p$-norm, and write $\rho_l = \rho_l(U,V,W,\Lambda,\Omega)$, $\rho'_l = \rho_l(U',V',W',\Lambda',\Omega)$ and
$d(\cdot,\cdot) = d((U,V,W,\Lambda),(U',V',W',\Lambda'))$. For the first function we have

$$\|f\|_{\mathrm{Lip}} = \sup\frac{|f(U,V,W,\Lambda,\Omega) - f(U',V',W',\Lambda',\Omega)|}{d(\cdot,\cdot)} = \sup\frac{|(\operatorname{tr}\rho_l)^2 - (\operatorname{tr}\rho'_l)^2|}{d(\cdot,\cdot)} = \sup\frac{|(\operatorname{tr}\rho_l - \operatorname{tr}\rho'_l)(\operatorname{tr}\rho_l + \operatorname{tr}\rho'_l)|}{d(\cdot,\cdot)}$$

$$\le \sup\frac{2\,|\operatorname{tr}(\rho_l - \rho'_l)|}{d(\cdot,\cdot)} \le \sup\frac{2\,\|\rho_l - \rho'_l\|_1}{d(\cdot,\cdot)}$$

$$\le \sup\frac{2\,\big\|V\Lambda V^\dagger U^n\big(W\Omega W^\dagger\otimes(|0\rangle\langle 0|)^{\otimes n}\big)(U^\dagger)^n - V'\Lambda' V'^\dagger U'^n\big(W'\Omega W'^\dagger\otimes(|0\rangle\langle 0|)^{\otimes n}\big)(U'^\dagger)^n\big\|_1}{\|U-U'\|_2 + \|V-V'\|_2 + \|W-W'\|_2 + \|\Lambda-\Lambda'\|_\infty},$$

where we are using standard inequalities and, for the sake of simplicity, we denote by the same letter $U$
unitaries that act on different systems, and by the same letter $V$ a unitary and its tensor product
with the identity. Adding and subtracting terms and applying the triangle inequality we get $2n+5$ terms
of the following form:

$$\frac{2\,\big\|V^*\Lambda^* V^{*\dagger} U^{*n}\big(W^*\Omega W^{*\dagger}\otimes(|0\rangle\langle 0|)^{\otimes n}\big)(U^{*\dagger})^n\big\|_1}{\|U-U'\|_2 + \|V-V'\|_2 + \|W-W'\|_2 + \|\Lambda-\Lambda'\|_\infty},$$

where any $X^*$ stands for $X$, $X'$ or $X - X'$, and in any term the latter appears only once. Then

$$\|f\|_{\mathrm{Lip}} \le \frac{(4n+10)\,\big\|V^*\Lambda^* V^{*\dagger} U^{*n}\big(W^*\Omega W^{*\dagger}\otimes(|0\rangle\langle 0|)^{\otimes n}\big)(U^{*\dagger})^n\big\|_1}{\|U-U'\|_2 + \|V-V'\|_2 + \|W-W'\|_2 + \|\Lambda-\Lambda'\|_\infty}.$$

Applying the inequality $\|XY\|_1 \le \|X\|_1\|Y\|_\infty$ we get

$$\|f\|_{\mathrm{Lip}} \le (4n+10)\,\frac{\|V^*\|_\infty\|\Lambda^*\|_\infty\|V^{*\dagger}\|_\infty\|U^*\|_\infty^n\|W^*\|_\infty\|\Omega\|_1\|W^{*\dagger}\|_\infty\|U^{*\dagger}\|_\infty^n}{\|U-U'\|_2 + \|V-V'\|_2 + \|W-W'\|_2 + \|\Lambda-\Lambda'\|_\infty}.$$

Now, by the decomposition we have done, any term has only one norm in the numerator of the form of the
ones in the denominator. The other norms in the numerator are trivially bounded by one. Thus we get

$$\|f\|_{\mathrm{Lip}} \le 4n + 10.$$

For the second function we have

$$\|g\|_{\mathrm{Lip}} = \sup\frac{|g(U,V,W,\Lambda,\Omega) - g(U',V',W',\Lambda',\Omega)|}{d(\cdot,\cdot)} = \sup\frac{|\operatorname{tr}(\rho_l^2 - \rho_l'^2)|}{\|U-U'\|_2 + \|V-V'\|_2 + \|W-W'\|_2 + \|\Lambda-\Lambda'\|_\infty},$$

where the result follows using the same techniques.



